Averaged Least-Mean-Squares: Bias-Variance Trade-offs and Optimal Sampling Distributions

Authors

  • Alexandre Défossez
  • Francis R. Bach
Abstract

We consider the least-squares regression problem and provide a detailed asymptotic analysis of the performance of averaged constant-step-size stochastic gradient descent. In the strongly-convex case, we provide an asymptotic expansion up to explicit exponentially decaying terms. Our analysis leads to new insights into stochastic approximation algorithms: (a) it gives a tighter bound on the allowed step-size; (b) the generalization error may be divided into a variance term that decays as O(1/n), independently of the step-size γ, and a bias term that decays as O(1/(γ²n²)); (c) when allowing non-uniform sampling of examples over a dataset, the choice of a good sampling density depends on the trade-off between bias and variance: when the variance term dominates, optimal sampling densities do not lead to much gain, while when the bias term dominates, we can choose larger step-sizes that lead to significant improvements.
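
As an illustration of the algorithm the abstract analyzes, a minimal sketch of averaged constant-step-size least-mean-squares might look like the following. The function names, the synthetic Gaussian data, the step size 0.05, and the choice to average the iterates after each update are assumptions made for illustration; they are not taken from the paper.

```python
import random


def averaged_lms(samples, dim, step_size):
    """Constant-step-size LMS that returns the running average of its iterates."""
    theta = [0.0] * dim      # current iterate, started at zero (assumption)
    theta_bar = [0.0] * dim  # running average of the iterates seen so far
    for k, (x, y) in enumerate(samples, start=1):
        # Residual of the current prediction on the new example.
        residual = sum(t * xi for t, xi in zip(theta, x)) - y
        # Stochastic gradient step on the squared loss 0.5 * residual**2.
        theta = [t - step_size * residual * xi for t, xi in zip(theta, x)]
        # Incremental update of the average of the first k updated iterates.
        theta_bar = [tb + (t - tb) / k for tb, t in zip(theta_bar, theta)]
    return theta_bar


def make_samples(n, theta_star, noise_std, rng):
    """Hypothetical synthetic data: Gaussian inputs, linear model plus noise."""
    samples = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in theta_star]
        y = sum(t * xi for t, xi in zip(theta_star, x)) + rng.gauss(0.0, noise_std)
        samples.append((x, y))
    return samples


rng = random.Random(0)
theta_star = [1.0, -2.0, 0.5]
samples = make_samples(20000, theta_star, noise_std=0.1, rng=rng)
estimate = averaged_lms(samples, dim=3, step_size=0.05)
# Largest coordinate-wise deviation of the averaged estimate from theta_star.
error = max(abs(e - t) for e, t in zip(estimate, theta_star))
```

Rerunning the sketch with different values of `step_size` and `n` gives a rough feel for the two regimes the abstract describes: the part of `error` driven by the noise, and the part driven by the distance from the starting point to `theta_star`.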

Similar resources

Constant Step Size Least-Mean-Square: Bias-Variance Trade-offs and Optimal Sampling Distributions

We consider the least-squares regression problem and provide a detailed asymptotic analysis of the performance of averaged constant-step-size stochastic gradient descent (a.k.a. least-mean-squares). In the strongly-convex case, we provide an asymptotic expansion up to explicit exponentially decaying terms. Our analysis leads to new insights into stochastic approximation algorithms: (a) it gives...

Harder, Better, Faster, Stronger Convergence Rates for Least-Squares Regression

We consider the optimization of a quadratic objective function whose gradients are only accessible through a stochastic oracle that returns the gradient at any given point plus a zero-mean finite variance random error. We present the first algorithm that achieves jointly the optimal prediction error rates for least-squares regression, both in terms of forgetting the initial conditions in O(1/n)...

Uniform CR Bound: Implementation Issues and Applications

We apply a uniform Cramér-Rao (CR) bound [1] to study the bias-variance trade-offs in single photon emission computed tomography (SPECT) image reconstruction. The uniform CR bound is used to specify achievable and unachievable regions in the bias-variance trade-off plane. The image reconstruction algorithms considered in this paper are: 1) Space alternating generalized EM and 2) penalized weigh...

Generalized Spatial Two Stage Least Squares Estimation of Spatial Autoregressive Models with Autoregressive Disturbances in the Presence of Endogenous Regressors and Many Instruments

This paper studies the generalized spatial two stage least squares (GS2SLS) estimation of spatial autoregressive models with autoregressive disturbances when there are endogenous regressors with many valid instruments. Using many instruments may improve the efficiency of estimators asymptotically, but the bias might be large in finite samples, making the inference inaccurate. We consider the ca...

Averaged Least-Mean-Square: Bias-Variance Trade-offs and Optimal Sampling Distributions Supplementary material

Throughout our results we will use the following notations and results. These are necessary to provide explicit expressions for the constants in the asymptotic expansions. For any real vector space V of finite dimension d, let M(V) be the space of linear operators over V, which is isomorphic to the space of d-by-d matrices, with the usual results that composition becomes matrix multiplication. ...

Publication date: 2015